Toward self-healing distributed services

Author

  • Raja R. Sambasivan
Abstract

Modern clouds and datacenters are rapidly growing in scale and complexity. They contain an eclectic set of hardware, networks, applications, and virtual machines that give rise to complex emergent behaviour, unobservable when components are tested individually. Problem diagnosis in these environments is especially difficult, as it can require developers to have expert knowledge about every detail of the datacenter. For example, Google engineers diagnosing a distributed service’s performance must often have intricate knowledge of the service, its dependencies (e.g., GFS, BigTable, the authentication mechanism, and the network topology), its configuration (e.g., the machines on which it’s running and its resource allocation), its critical paths, and the many co-located applications that might be interfering with it. To keep problems in these environments from becoming completely undiagnosable, many researchers believe the only recourse is to build self-healing systems, capable of fixing problems automatically without human intervention.


Similar resources

Self-healing components in robust software architecture for concurrent and distributed systems

This paper describes an approach to designing self-healing components for robust, concurrent, and distributed software architectures. A self-healing component is able to detect object anomalies inside of the component, reconfigure inter-component and intra-components before and after repairing the sick object, repair it, and then test the healed object. For this, each self-healing component is str...


Performance Evaluation of Self-healing Functionality in ATM Networks: the Reform Perspective

In modern networks, advanced and novel concepts are being deployed for the provision of protected services to the end user. Amongst these, the self-healing concept refers to the capability of the network to restore services in a distributed way without any immediate operator intervention, this in order to assure delivery of constant quality services under fault conditions. In this paper, the Se...


Self-healing in payment switches with a focus on failure detection using State Machine-based approaches

Composition, change and complexity have attracted everyone’s attention towards Self-Adaptive systems. These systems, inspired by the human body, are capable of adapting to changes in the inner and outer environment. The main objective of this study is to achieve a more convenient availability for e-banking services in the payment switch, using self-healing systems and focusing on the failur...



A Self-healing Cycle for Web Service Composition

The Web service paradigm allows applications to interact electronically with one another over the Internet. Standards and languages, such as BPEL and OWL-S provide a platform with which Web services can be integrated. Moreover, various AI planning techniques have also been adapted to integrate services. However, the autonomous and distributed nature of an integrated service presents unique chal...


On Conditions for Self-Healing in Distributed Software Systems

This paper attempts to identify one of the necessary conditions for self-healing, or self-repair, in complex systems, and to propose means for satisfying this condition in heterogeneous distributed software. The condition identified here is the following: For a system with a wide and open range of possible configurations to be self healing, it must possess suitable regularities, which can be re...




Publication date: 2011